Veracruz
Adaptive Margin RLHF via Preference over Preferences
Chittepu, Yaswanth, Singhal, Prasann, Durrett, Greg, Niekum, Scott
Margin-based optimization is fundamental to improving generalization and robustness in classification tasks. In the context of reward model learning from preferences within Reinforcement Learning from Human Feedback (RLHF), existing methods typically rely on no margins, fixed margins, or margins that are simplistic functions of preference ratings. However, such formulations often fail to account for the varying strengths of different preferences (for example, some preferences imply larger margins between responses than others), or they rely on noisy margin information derived from ratings. We argue that modeling the strength of preferences can lead to better generalization and more faithful alignment. Furthermore, many existing methods that use adaptive margins assume access to accurate preference scores, which can be difficult for humans to provide reliably. We propose an approach that leverages preferences over preferences, that is, annotations indicating which of two preferences reflects a stronger distinction. We use this ordinal signal to infer adaptive margins on a per-datapoint basis. We introduce DPO-PoP, an extension of Direct Preference Optimization (DPO) that incorporates adaptive margins from preference-over-preference supervision, enabling improved discriminative and generative performance. Empirically, our method outperforms vanilla DPO, DPO with fixed margins, and DPO with ground-truth margins on the UltraFeedback dataset. Additionally, we show that there is a tradeoff between discriminative and generative performance: improving test classification accuracy, particularly by correctly labeling weaker preferences at the expense of stronger ones, can lead to a decline in generative quality. To navigate this tradeoff, we propose two sampling strategies for gathering preference-over-preference labels: one favoring discriminative performance and one favoring generative performance.
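A minimal sketch of how a per-example adaptive margin can enter the DPO objective, assuming a PyTorch setting; the `margins` tensor and the function signature are illustrative stand-ins for whatever the preference-over-preference inference step produces, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def dpo_adaptive_margin_loss(policy_chosen_logps, policy_rejected_logps,
                             ref_chosen_logps, ref_rejected_logps,
                             margins, beta=0.1):
    """Margin-augmented DPO loss over a batch of preference pairs.

    `margins` holds one adaptive margin per pair; in DPO-PoP these
    would be inferred from preference-over-preference annotations
    (that inference step is not reproduced here).
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Subtracting the margin demands a larger reward gap before a pair
    # counts as well-separated, so stronger preferences exert more
    # pressure on the policy than weaker ones.
    logits = chosen_rewards - rejected_rewards - margins
    return -F.logsigmoid(logits).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs.
lp = lambda: torch.randn(4)
loss = dpo_adaptive_margin_loss(lp(), lp(), lp(), lp(),
                                margins=torch.tensor([0.0, 0.2, 0.5, 1.0]))
```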
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > Mexico > Veracruz (0.04)
- North America > United States > New York (0.04)
- (4 more...)
A symbolic Perl algorithm for the unification of Nahuatl word spellings
Guzmán-Landa, Juan-José, Vázquez-Osorio, Jesús, Torres-Moreno, Juan-Manuel, Quintana-Torres, Ligia, Figueroa-Saavedra, Miguel, Avendaño-Garrido, Martha-Lorena, Ranger, Graham, Velázquez-Morales, Patricia, Sierra Martínez, Gerardo Eugenio
In this paper, we describe a symbolic model for the automatic orthographic unification of Nawatl text documents. Our model is based on algorithms that we have previously used to analyze sentences in Nawatl, and on the corpus called π-yalli, which consists of texts in several Nawatl orthographies. Our automatic unification algorithm implements linguistic rules as symbolic regular expressions. We also present a manual evaluation protocol, which we have designed and implemented to assess the quality of the unified sentences generated by our algorithm by testing them on a sentence-level semantic task. We have obtained encouraging results from the evaluators for most of the desired features of our artificially unified sentences.
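Although the paper's algorithm is written in Perl, the rule-cascade idea is easy to sketch; below is a minimal Python analogue in which the regular-expression rules are illustrative guesses at common Nahuatl spelling correspondences, not the authors' actual rule set.

```python
import re

# Illustrative (not the paper's) normalization rules mapping common
# Nahuatl orthographic variants onto a single target spelling.
RULES = [
    (re.compile(r"qu(?=[ei])"), "k"),    # 'que', 'qui' -> 'ke', 'ki'
    (re.compile(r"c(?=[ei])"), "s"),     # soft 'c' -> 's'
    (re.compile(r"c(?=[aou])"), "k"),    # hard 'c' -> 'k'
    (re.compile(r"hu(?=[aeio])"), "w"),  # prevocalic 'hu' -> 'w'
    (re.compile(r"uh"), "w"),            # syllable-final 'uh' -> 'w'
    (re.compile(r"z"), "s"),
]

def unify(word: str) -> str:
    """Apply the rule cascade in order, as a symbolic rewriting pass."""
    for pattern, replacement in RULES:
        word = pattern.sub(replacement, word)
    return word

print(unify("nahuatl"))  # -> "nawatl"
```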
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Mexico > Veracruz > Xalapa (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.97)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)
David Byrne's Career of Earnest Alienation
At seventy-three, the former front man of Talking Heads is still asking questions about what it means to be alive. "When you step onstage, it's a very artificial situation," Byrne said. "To pretend it's not--that isn't being authentic." If you spend enough time wandering around downtown Manhattan, the odds are that you'll eventually encounter the musician David Byrne riding a bicycle. One day this past June, pedalling alongside Byrne from his apartment in Chelsea to the Governors Island ferry, I watched at least a dozen New Yorkers clock his profile, whipping around to squint, softly pinching the arm of their companion and whispering, "Was that . . . ?" By then, Byrne was gone, a tuft of white hair whizzing toward the horizon. Spotting Byrne on two wheels has become a New York City rite of passage, like sussing out the best halal cart in midtown, or dropping something important onto the subway tracks. During the few months that Byrne and I spent together, I never saw him traverse the ...
- North America > United States > Illinois > Cook County > Chicago (0.04)
- South America > Peru (0.04)
- Oceania > New Zealand (0.04)
- (14 more...)
A First Context-Free Grammar Applied to Nawatl Corpora Augmentation
Guzmán-Landa, Juan-José, Torres-Moreno, Juan-Manuel, Figueroa-Saavedra, Miguel, Quintana-Torres, Ligia, Avendaño-Garrido, Martha-Lorena, Ranger, Graham
In this article we introduce a context-free grammar (CFG) for the Nawatl language. Nawatl (or Nahuatl) is an Amerindian language of the π-language type, i.e. a language with few digital resources, for which the corpora available for machine learning are virtually non-existent. The objective here is to generate a significant number of grammatically correct artificial sentences, in order to increase the corpora available for language model training. We want to show that a grammar enables us to significantly expand a corpus in Nawatl which we call π-yalli. The corpus, thus enriched, enables us to train algorithms such as FastText and to evaluate them on sentence-level semantic tasks. Preliminary results show that by using the grammar, comparative improvements are achieved over some LLMs. However, it is observed that to achieve more significant improvements, grammars that model the Nawatl language even more effectively are required.
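As a concrete illustration of the augmentation idea, here is a minimal random-expansion generator over a toy CFG; the productions and vocabulary below are illustrative placeholders, not the grammar the paper introduces.

```python
import random

# A toy grammar in the spirit of the paper's CFG. Nonterminals map to
# lists of alternative productions; anything without a rule is a terminal.
GRAMMAR = {
    "S":    [["NP", "VP"]],
    "NP":   [["noun"], ["det", "noun"]],
    "VP":   [["verb"], ["verb", "NP"]],
    "det":  [["in"]],
    "noun": [["kalli"], ["atl"], ["tlakatl"]],  # house, water, person
    "verb": [["kita"], ["neki"]],               # sees, wants
}

def generate(symbol="S"):
    """Recursively expand a symbol by picking a random production."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal token
    production = random.choice(GRAMMAR[symbol])
    return [tok for sym in production for tok in generate(sym)]

# Emit artificial sentences for corpus augmentation.
for _ in range(5):
    print(" ".join(generate()))
```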
- North America > Mexico > Puebla (0.04)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (10 more...)
Discrete Audio Tokens: More Than a Survey!
Mousavi, Pooneh, Maimon, Gallil, Moumen, Adel, Petermann, Darius, Shi, Jiatong, Wu, Haibin, Yang, Haici, Kuznetsova, Anastasia, Ploujnikov, Artem, Marxer, Ricard, Ramabhadran, Bhuvana, Elizalde, Benjamin, Lugosch, Loren, Li, Jinyu, Subakan, Cem, Woodland, Phil, Kim, Minje, Lee, Hung-yi, Watanabe, Shinji, Adi, Yossi, Ravanelli, Mirco
Discrete audio tokens are compact representations that aim to preserve perceptual quality, phonetic content, and speaker characteristics while enabling efficient storage and inference, as well as competitive performance across diverse downstream tasks. They provide a practical alternative to continuous features, enabling the integration of speech and audio into modern large language models (LLMs). As interest in token-based audio processing grows, various tokenization methods have emerged, and several surveys have reviewed the latest progress in the field. However, existing studies often focus on specific domains or tasks and lack a unified comparison across various benchmarks. This paper presents a systematic review and benchmark of discrete audio tokenizers, covering three domains: speech, music, and general audio. We propose a taxonomy of tokenization approaches based on encoder-decoder architecture, quantization technique, training paradigm, streamability, and application domain. We evaluate tokenizers on multiple benchmarks for reconstruction, downstream performance, and acoustic language modeling, and analyze trade-offs through controlled ablation studies. Our findings highlight key limitations, practical considerations, and open challenges, providing insight and guidance for future research in this rapidly evolving area.
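One quantization technique such a taxonomy covers is residual vector quantization (RVQ); the sketch below shows the encoding step, with randomly initialized codebooks standing in for the learned codebooks a real tokenizer would use.

```python
import numpy as np

def rvq_encode(frame, codebooks):
    """Residual vector quantization of one feature frame: each stage
    quantizes the residual left by the previous stage, yielding one
    discrete token per stage."""
    tokens, residual = [], frame.copy()
    for codebook in codebooks:  # each codebook has shape (K, D)
        dists = np.linalg.norm(codebook - residual, axis=1)
        idx = int(np.argmin(dists))
        tokens.append(idx)
        residual = residual - codebook[idx]
    return tokens

# Toy example: 3 stages, 256-entry codebooks over 64-dim frames.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(256, 64)) for _ in range(3)]
frame = rng.normal(size=64)
print(rvq_encode(frame, codebooks))  # e.g. [17, 203, 96]
```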
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Mexico > Puebla (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (8 more...)
- Leisure & Entertainment (1.00)
- Media > Music (0.93)
- Information Technology (0.67)
- Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Simulating multiple human perspectives in socio-ecological systems using large language models
Zeng, Yongchao, Brown, Calum, Kyriakou, Ioannis, Hotz, Ronja, Rounsevell, Mark
Understanding socio-ecological systems requires insights from diverse stakeholder perspectives, which are often hard to access. To enable alternative, simulation-based exploration of different stakeholder perspectives, we develop the HoPeS (Human-Oriented Perspective Shifting) modelling framework. HoPeS employs agents powered by large language models (LLMs) to represent various stakeholders; users can step into the agent roles to experience perspectival differences. A simulation protocol serves as a "scaffold" to streamline multiple perspective-taking simulations, supporting users in reflecting on, transitioning between, and integrating across perspectives. A prototype system is developed to demonstrate HoPeS in the context of institutional dynamics and land use change, enabling both narrative-driven and numerical experiments. In an illustrative experiment, a user successively adopts the perspectives of a system observer and a researcher, a role that analyses data from the embedded land use model to inform evidence-based decision-making for other LLM agents representing various institutions. Despite the user's effort to recommend technically sound policies, discrepancies persist between policy recommendation and implementation due to stakeholders' competing advocacies, mirroring real-world misalignment between researcher and policymaker perspectives. The user's reflection highlights subjective feelings of frustration and disappointment as a researcher, especially due to the challenge of maintaining political neutrality while attempting to gain political influence. Despite this, the user exhibits high motivation to experiment with alternative narrative framing strategies, suggesting the system's potential for exploring different perspectives. Further refinement of the system and protocol is likely to enable new forms of interdisciplinary collaboration in socio-ecological simulations.
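The protocol's core loop, stepping a user through successive agent roles, might look like the following sketch; the role prompts and the `call_llm` stub are hypothetical stand-ins, since the framework's actual API is not reproduced here.

```python
# Hypothetical role prompts for a land-use scenario; HoPeS itself may
# define roles and state very differently.
ROLES = {
    "observer":   "You observe the land-use system and report trends.",
    "researcher": "You analyse land-use data and recommend policy.",
    "farmer":     "You advocate for agricultural land users.",
}

def call_llm(system_prompt: str, message: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def simulate(role_sequence, state="initial land-use summary"):
    """Step the user through perspectives; each role's output becomes
    the state the next perspective reacts to."""
    transcript = []
    for role in role_sequence:
        reply = call_llm(ROLES[role], f"Current state: {state}")
        transcript.append((role, reply))
        state = reply
    return transcript
```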
- Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
- North America > United States > New York (0.04)
- North America > Mexico > Veracruz (0.04)
- (5 more...)
- Research Report (1.00)
- Overview (1.00)
- Law (1.00)
- Health & Medicine (1.00)
- Government (1.00)
- Leisure & Entertainment > Games > Computer Games (0.67)
Enhanced Urdu Intent Detection with Large Language Models and Prototype-Informed Predictive Pipelines
Hassan, Faiza, Saleem, Summra, Javed, Kashif, Asim, Muhammad Nabeel, Rehman, Abdur, Dengel, Andreas
Multifarious intent detection predictors have been developed for different languages, including English, Chinese, and French; however, the field remains underdeveloped for Urdu, the 10th most spoken language. For well-resourced languages, intent detection predictors employ few-shot learning to predict unseen classes after training on seen classes. The Urdu language, however, lacks few-shot intent detection predictors, and traditional predictors are restricted to predicting the same classes the models saw in the training set. To empower Urdu-specific intent detection, this work introduces a unique contrastive learning approach that leverages unlabeled Urdu data to re-train pre-trained language models. This re-training strengthens the LLMs' representation learning for the downstream intent detection task. Finally, it combines the potential of pre-trained LLMs with a prototype-informed attention mechanism to create a comprehensive end-to-end intent detection pipeline, LLMPIA. Within the proposed predictive pipeline, we explore the potential of 6 distinct language models and 13 distinct similarity computation methods. The proposed framework is evaluated on 2 public benchmark datasets, namely ATIS (5,836 samples) and Web Queries (8,519 samples). On the ATIS dataset, under 4-way 1-shot and 4-way 5-shot experimental settings, LLMPIA achieved F1-scores of 83.28% and 98.25%, respectively; on the Web Queries dataset it produced 76.23% and 84.42%. In an additional case study on the Web Queries dataset, with the same classes in the train and test sets, LLMPIA outperformed the state-of-the-art predictor by 53.55% F1-score.
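The prototype idea at the heart of such pipelines is compact enough to sketch; the following uses cosine similarity as one of many possible similarity computations, and the embedding inputs are assumed to come from whichever re-trained language model is in use.

```python
import numpy as np

def prototype_classify(support_emb, support_labels, query_emb):
    """Few-shot intent classification via class prototypes: average the
    support embeddings of each class, then assign the query to the
    most similar prototype (cosine similarity here)."""
    classes = sorted(set(support_labels))
    labels = np.asarray(support_labels)
    protos = np.stack([support_emb[labels == c].mean(axis=0)
                       for c in classes])
    protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    return classes[int(np.argmax(protos @ q))]

# Toy 4-way 1-shot episode with random 16-dim "embeddings".
rng = np.random.default_rng(0)
support = rng.normal(size=(4, 16))
print(prototype_classify(support, ["book", "cancel", "greet", "weather"],
                         support[2]))  # -> "greet"
```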
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Germany > Rhineland-Palatinate > Kaiserslautern (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- (4 more...)
Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code
Bappy, Md. Azizul Hakim, Mustafa, Hossen A, Saha, Prottoy, Salehat, Rajinus
Large Language Models (LLMs) have demonstrated significant capabilities in understanding and analyzing code for security vulnerabilities, such as Common Weakness Enumerations (CWEs). However, their reliance on cloud infrastructure and substantial computational requirements pose challenges for analyzing sensitive or proprietary codebases due to privacy concerns and inference costs. This work explores the potential of Small Language Models (SLMs) as a viable alternative for accurate, on-premise vulnerability detection. We investigated whether a 350-million-parameter pre-trained code model (codegen-mono) could be effectively fine-tuned to detect the MITRE Top 25 CWEs specifically within Python code. To facilitate this, we developed a targeted dataset of 500 examples using a semi-supervised approach involving LLM-driven synthetic data generation coupled with meticulous human review. Initial tests confirmed that the base codegen-mono model completely failed to identify CWEs in our samples. However, after applying instruction-following fine-tuning, the specialized SLM achieved remarkable performance on our test set, yielding approximately 99% accuracy, 98.08% precision, 100% recall, and a 99.04% F1-score. These results strongly suggest that fine-tuned SLMs can serve as highly accurate and efficient tools for CWE detection, offering a practical and privacy-preserving solution for integrating advanced security analysis directly into development workflows.
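A minimal sketch of such an instruction fine-tuning setup, assuming the Hugging Face stack; the dataset file, prompt format, and hyperparameters are illustrative placeholders rather than the authors' exact configuration.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)
from datasets import load_dataset

model_name = "Salesforce/codegen-350M-mono"
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token  # codegen tokenizer has no pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical file: each record pairs a Python snippet with its label,
# e.g. {"prompt": "Identify the CWE in:\n<code>", "completion": "CWE-89"}.
data = load_dataset("json", data_files="cwe_examples.jsonl")["train"]

def encode(ex):
    ids = tok(ex["prompt"] + ex["completion"] + tok.eos_token,
              truncation=True, max_length=512, padding="max_length")
    ids["labels"] = ids["input_ids"].copy()  # causal-LM objective
    return ids

trainer = Trainer(
    model=model,
    args=TrainingArguments("cwe-slm", num_train_epochs=3,
                           per_device_train_batch_size=4),
    train_dataset=data.map(encode, remove_columns=data.column_names),
)
trainer.train()
```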
- North America > Mexico > Veracruz (0.04)
- Europe > Greece > Attica > Athens (0.04)
- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- (2 more...)
HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs
Nguyen, Tin, Bolton, Logan, Taesiri, Mohammad Reza, Nguyen, Anh Totti
An Achilles' heel of Large Language Models (LLMs) is their tendency to hallucinate non-factual statements. A response that mixes factual and non-factual statements poses a challenge for humans trying to verify it and accurately base their decisions on it. To combat this problem, we propose Highlighted Chain-of-Thought Prompting (HoT), a technique for prompting LLMs to generate responses with XML tags that ground facts to those provided in the query. That is, given an input question, LLMs would first re-format the question to add XML tags highlighting key facts, and then generate a response with highlights over the facts referenced from the input. Interestingly, in few-shot settings, HoT outperforms vanilla chain-of-thought prompting (CoT) on 17 tasks ranging from arithmetic and reading comprehension to logical reasoning. When humans are asked to verify LLM responses, the highlights help time-limited participants recognize more accurately and efficiently when LLMs are correct. Yet, surprisingly, when LLMs are wrong, HoT responses tend to make users believe that an answer is correct.
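The two-stage flow might look like the sketch below; `ask_llm` is a hypothetical stand-in for any chat-completion client, and the instructions paraphrase the paper's idea rather than quote its prompts.

```python
def ask_llm(prompt: str) -> str:
    raise NotImplementedError("plug in an LLM client here")

def highlighted_cot(question: str) -> str:
    """Two-stage HoT-style prompting."""
    # Stage 1: re-format the question, wrapping key facts in XML tags.
    tagged_question = ask_llm(
        "Rewrite the question, wrapping each key fact in "
        "<fact1>...</fact1>, <fact2>...</fact2> tags:\n" + question
    )
    # Stage 2: answer while grounding claims in the tagged facts.
    return ask_llm(
        "Answer the question. Whenever you rely on one of the tagged "
        "facts, wrap the corresponding span of your answer in the "
        "same tag:\n" + tagged_question
    )
```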
- Europe > Ukraine (0.27)
- Asia (0.27)
- North America > Mexico > Veracruz (0.14)
- (5 more...)
- Research Report (1.00)
- Overview > Fact Book (0.34)
- Leisure & Entertainment > Sports > Football (1.00)
- Health & Medicine > Therapeutic Area (1.00)
MITRE ATT&CK Applications in Cybersecurity and The Way Forward
Jiang, Yuning, Meng, Qiaoran, Shang, Feiyang, Oo, Nay, Minh, Le Thi Hong, Lim, Hoon Wei, Sikdar, Biplab
The MITRE ATT&CK framework is a widely adopted tool for enhancing cybersecurity, supporting threat intelligence, incident response, attack modeling, and vulnerability prioritization. This paper synthesizes research on its application across these domains by analyzing 417 peer-reviewed publications. We identify commonly used adversarial tactics, techniques, and procedures (TTPs) and examine the integration of natural language processing (NLP) and machine learning (ML) with ATT&CK to improve threat detection and response. Additionally, we explore the interoperability of ATT&CK with other frameworks, such as the Cyber Kill Chain, NIST guidelines, and STRIDE, highlighting its versatility. The paper further evaluates the framework from multiple perspectives, including its effectiveness, validation methods, and sector-specific challenges, particularly in industrial control systems (ICS) and healthcare. We conclude by discussing current limitations and proposing future research directions to enhance the applicability of ATT&CK in dynamic cybersecurity environments.
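As an illustration of the NLP-plus-ATT&CK integration pattern the survey examines, here is a toy keyword mapper from report text to technique IDs; the keyword table is a stand-in for the far richer classifiers the reviewed papers build.

```python
# Three real ATT&CK technique IDs with illustrative trigger keywords.
TECHNIQUES = {
    "T1566": ["phishing", "spearphishing", "malicious attachment"],
    "T1059": ["powershell", "command and scripting interpreter"],
    "T1486": ["ransomware", "data encrypted for impact"],
}

def map_to_attack(report_text: str) -> list[str]:
    """Return ATT&CK technique IDs whose keywords appear in the text."""
    text = report_text.lower()
    return [tid for tid, kws in TECHNIQUES.items()
            if any(kw in text for kw in kws)]

print(map_to_attack(
    "The actor delivered a malicious attachment, then ran PowerShell."
))  # -> ['T1566', 'T1059']
```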
- Asia > Singapore > Central Region > Singapore (0.04)
- North America > United States > Florida > Orange County > Orlando (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- (14 more...)
- Overview (1.00)
- Research Report > New Finding (0.46)
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (1.00)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Communications > Networks (1.00)
- (8 more...)